(57) Abstract: Measuring the loudness of audio encoded in a bitstream that includes data from which an approximation of the
power spectrum of the audio can be derived without fully decoding the audio is performed by deriving the approximation of the
power spectrum of the audio from said bitstream without fully decoding the audio, and determining an approximate loudness of
the audio in response to the approximation of the power spectrum of the audio. The data may include coarse representations of the
audio and associated finer representations of the audio, the approximation of the power spectrum of the audio being derived from
the coarse representations of the audio. In the case of subband encoded audio, the coarse representations of the audio may comprise
scale factors and the associated finer representations of the audio may comprise sample data associated with each scale factor.
Description

Economical Loudness Measurement of Coded Audio 

Technical Field 

The invention relates to audio signal processing. More particularly, it
relates to an economical calculation of an objective loudness measure of
low-bitrate coded audio such as audio coded using Dolby Digital (AC-3),
Dolby Digital Plus, or Dolby E. "Dolby", "Dolby Digital", "Dolby Digital
Plus", and "Dolby E" are trademarks of Dolby Laboratories Licensing
Corporation. Aspects of the invention may also be usable with other types of
audio coding.

Background Art 
Details of Dolby Digital coding are set forth in the following 
references: 

ATSC Standard A/52A: Digital Audio Compression Standard (AC-3),
Revision A, Advanced Television Systems Committee, 20 Aug. 2001. The
A/52A document is available on the World Wide Web at
http://www.atsc.org/standards.html.

"Flexible Perceptual Coding for Audio Transmission and Storage," by
Craig C. Todd, et al, 96th Convention of the Audio Engineering Society,
February 26, 1994, Preprint 3796.

"Design and Implementation of AC-3 Coders," by Steve Vernon, 
IEEE Trans. Consumer Electronics , Vol. 41, No. 3, August 1995. 

"The AC-3 Multichannel Coder" by Mark Davis, Audio Engineering 
Society Preprint 3774, 95th AES Convention, October, 1993.

"High Quality, Low-Rate Audio Transform Coding for Transmission 
and Multimedia Applications," by Bosi et al, Audio Engineering Society 
Preprint 3365, 93rd AES Convention, October, 1992. 
United States Patents 5,583,962; 5,632,005; 5,633,981; 5,727,119;
5,909,664; and 6,021,386.

Details of Dolby Digital Plus coding are set forth in "Introduction to 
Dolby Digital Plus, an Enhancement to the Dolby Digital Coding System," 
AES Convention Paper 6196, 117th AES Convention, October 28, 2004.

Details of Dolby E coding are set forth in "Efficient Bit Allocation, 
Quantization, and Coding in an Audio Distribution System", AES Preprint 
5068, 107th AES Conference, August 1999 and "Professional Audio Coder 
Optimized for Use with Video", AES Preprint 5033, 107th AES Conference,
August 1999.

An overview of various perceptual coders, including Dolby encoders, 
MPEG encoders, and others is set forth in "Overview of MPEG Audio: 
Current and Future Standards for Low-Bit-Rate Audio Coding," by 
Karlheinz Brandenburg and Marina Bosi, J. Audio Eng. Soc., Vol. 45, No.
1/2, January/February 1997.

All of the above-cited references are hereby incorporated by reference, 
each in its entirety. 

Many methods exist for objectively measuring the perceived loudness
of audio signals. Examples of methods include weighted power measures
(such as LeqA, LeqB, LeqC) as well as psychoacoustic-based measures of
loudness such as "Acoustics — Method for Calculating Loudness Level,"
ISO 532 (1975). Weighted power loudness measures process the input audio
signal by applying a predetermined filter that emphasizes more perceptibly
sensitive frequencies while deemphasizing less perceptibly sensitive
frequencies, and then averaging the power of the filtered signal over a
predetermined length of time. Psychoacoustic methods are typically more
complex and aim to model better the workings of the human ear. This is
achieved by dividing the audio signal into frequency bands that mimic the
frequency response and sensitivity of the ear, and then manipulating and
integrating these bands while taking into account psychoacoustic
phenomena such as frequency and temporal masking, as well as the
non-linear perception of loudness with varying signal intensity. The aim of all
objective loudness measurement methods is to derive a numerical
measurement of loudness that closely matches the subjective perception of
loudness of an audio signal.

WO 2006/113047 PCT/US2006/010823

Perceptual coding or low-bitrate audio coding is commonly used to 
data compress audio signals for efficient storage, transmission and delivery 
in applications such as broadcast digital television and the online Internet 
sale of music. Perceptual coding achieves its efficiency by transforming the 
audio signal into an information space where both redundancies and signal 
components that are psychoacoustically masked can be easily discarded. 
The remaining information is packed into a stream or file of digital 
information. Typically, measuring the loudness of the audio represented by 
low-bitrate coded audio requires decoding the audio back into the time 
domain (e.g., PCM), which can be computationally intensive. However, 
some low-bitrate perceptual-coded signals contain information that may be 
useful to a loudness measurement method, thereby saving the computational 
cost of fully decoding the audio. Dolby Digital (AC-3), Dolby Digital Plus,
and Dolby E are among such audio coding systems. 

The Dolby Digital, Dolby Digital Plus, and Dolby E low-bitrate 
perceptual audio coders divide audio signals into overlapping, windowed 
time segments (or audio coding blocks) that are transformed into a frequency 
domain representation. The frequency domain representation of spectral 
coefficients is expressed by an exponential notation comprising sets of an 
exponent and associated mantissas. The exponents, which function in the 
manner of scale factors, are packed into the coded audio stream. The 
mantissas represent the spectral coefficients after they have been normalized 
by the exponents. The exponents are then passed through a perceptual 
model of hearing and used to quantize and pack the mantissas into the coded
audio stream. Upon decoding, the exponents are unpacked from the coded
audio stream and then passed through the same perceptual model to
determine how to unpack the mantissas. The mantissas are then unpacked,
combined with the exponents to create a frequency domain representation of
the audio that is then decoded and converted back to a time domain
representation.

Because many loudness measurements include power and power
spectrum calculations, computational savings may be achieved by only
partially decoding the low-bitrate coded audio and passing the partially
decoded information (such as the power spectrum) to the loudness
measurement. The invention is useful whenever there is a need to measure
loudness but not to decode the audio. It exploits the fact that a loudness
measurement can make use of an approximate version of the audio, such
approximation not usually being suitable for listening. An aspect of the
present invention is the recognition that a coarse representation of the audio,
which is available without fully decoding a bitstream in many audio coding
systems, can provide an approximation of the audio spectrum that is usable
in measuring the loudness of the audio. In Dolby Digital, Dolby Digital
Plus, and Dolby E audio coding, exponents provide an approximation of the
power spectrum of the audio. Similarly, in certain other coding systems,
scale factors, spectral envelopes, and linear predictive coefficients may
provide an approximation of the power spectrum of the audio. These and
other aspects and advantages of the invention will be better understood as the
following summary and description of the invention are read and understood.

The invention provides a computationally economical measurement of 
the perceived loudness of low-bitrate coded audio. This is achieved by only 
partially decoding the audio material and by passing the partially decoded 
information to a loudness measurement. The method takes advantage of
specific properties of the partially decoded audio information such as the
exponents in Dolby Digital, Dolby Digital Plus, and Dolby E audio coding.

A first aspect of the invention measures the loudness of audio encoded
in a bitstream that includes data from which an approximation of the power
spectrum of the audio can be derived without fully decoding the audio by
deriving the approximation of the power spectrum of the audio from the
bitstream without fully decoding the audio, and determining an approximate
loudness of the audio in response to the approximation of the power
spectrum of the audio.

In another aspect of the invention, the data may include coarse
representations of the audio and associated finer representations of the audio,
in which case the approximation of the power spectrum of the audio may be
derived from the coarse representations of the audio.

In a further aspect of the invention, the audio encoded in a bitstream
may be subband encoded audio having a plurality of frequency subbands,
each subband having a scale factor and sample data associated therewith, and
in which the coarse representations of the audio comprise scale factors and
the associated finer representations of the audio comprise sample data
associated with each scale factor.

In yet a further aspect of the invention, the scale factor and sample
data of each subband may represent spectral coefficients in the subband by
exponential notation in which the scale factor comprises an exponent and the
associated sample data comprises mantissas.

In yet a further aspect of the invention, the audio encoded in a
bitstream may be linear predictive coded audio in which the coarse
representations of the audio comprise linear predictive coefficients and the
finer representations of the audio comprise excitation information associated
with the linear predictive coefficients.

In still a further aspect of the invention, the coarse representations of
the audio may comprise at least one spectral envelope and the finer
representations of the audio may comprise spectral components associated
with the at least one spectral envelope.

In still yet a further aspect of the invention, determining an 
approximate loudness of the audio in response to the approximation of the 
power spectrum of the audio may include applying a weighted power 
loudness measure. The weighted power loudness measure may employ a 
filter that deemphasizes less perceptible frequencies and averages the power
of the filtered audio over time.

In yet another aspect of the invention, determining an approximate 
loudness of the audio in response to the approximation of the power 
spectrum of the audio may include applying a psychoacoustic loudness 
measure. The psychoacoustic loudness measure may employ a model of the 
human ear to determine specific loudness in each of a plurality of frequency 
bands similar to the critical bands of the human ear. In a subband coder 
environment, the subbands may be similar to the critical bands of the human 
ear and the psychoacoustic loudness measure may employ a model of the 
human ear to determine specific loudness in each of the subbands. 

Aspects of the invention include methods practicing the above 
functions, means practicing the functions, apparatus practicing the methods, 
and a computer program, stored on a computer-readable medium for causing 
a computer to perform the methods practicing the above functions. 

Description of the Drawings 

FIG. 1 shows a schematic functional block diagram of a general 
arrangement for measuring the loudness of low-bitrate coded audio. 

FIG. 2 shows a generalized schematic functional block diagram of a 
Dolby Digital, a Dolby Digital Plus, and a Dolby E decoder. 

FIGS. 3a and 3b show schematic functional block diagrams of two
general arrangements for calculating an objective loudness measure using
weighted power and psychoacoustically-based measures, respectively.

FIG. 4 shows common frequency weightings used when measuring
loudness according to the arrangement of the example of FIG. 3a.

FIG. 5 is a schematic functional block diagram showing a more
economical general arrangement for measuring the loudness of coded audio
in accordance with aspects of the invention.

FIGS. 6a and 6b are schematic functional block diagrams of the more
economical arrangement for measuring loudness incorporating the loudness
arrangements shown in the examples of FIGS. 3a and 3b in accordance with
aspects of the invention.

Best Mode for Carrying out the Invention 
A benefit of aspects of the present invention is the measurement of the
loudness of low-bitrate coded audio without the need to decode fully the
audio to PCM, which decoding includes expensive decoding processing
steps such as bit allocation, de-quantization, an inverse transformation, etc.
Aspects of the invention greatly reduce the processing requirements
(computational overhead). This approach is beneficial when a loudness
measurement is desired but the decoded audio is not needed.

Aspects of the present invention are usable, for example, in
environments such as disclosed in (1) pending United States
Non-Provisional Patent Application S.N. 10/884,177, filed July 1, 2004, entitled
"Method for Collecting Metadata Affecting the Playback Loudness and
Dynamic Range of Audio Information," by Smithers et al; (2) United States
Provisional Patent Application S.N. 60/xxx,xxx, filed the same day as the
present application, entitled "Audio Metadata Verification," by Brett
Graham Crockett, Attorneys' Docket DOL150; and (3) in the
performance of loudness measurement and correction in a broadcast storage
or transmission chain in which access to the decoded audio is not needed and
is not desirable. Said S.N. 10/884,177 and said Attorneys' Docket DOL150
applications are hereby incorporated by reference in their entirety.

The processing savings provided by aspects of the invention also help
make it possible to perform loudness measurement and metadata correction
(e.g., changing a DIALNORM parameter to the correct value) in real time on
a large number of low-bitrate data compressed audio signals. Often, many
low-bitrate coded audio signals are multiplexed and transported in MPEG
transport streams. The loudness measurement according to aspects of the
present invention makes loudness measurement in real time on a large
number of compressed audio signals much more feasible when compared to
the requirements of fully decoding the compressed audio signals to PCM to
perform the loudness measurement.

FIG. 1 shows a prior art arrangement for measuring the loudness of
coded audio. Coded digital audio data or information 101, such as audio that
has been low-bitrate encoded, is decoded by a decoder or decoding function
("Decode") 102 into, for example, a PCM audio signal 103. This signal is
then applied to a loudness measurer or measuring method or algorithm
("Measure Loudness") 104 that generates a measured loudness value 105.

FIG. 2 shows a prior art structural or functional block diagram of an
example of a Decode 102. The structure or functions it shows are
representative of Dolby Digital, Dolby Digital Plus, and Dolby E decoders.
Frames of coded audio data 101 are applied to a data unpacker or unpacking
function ("Frame Sync, Error Detection & Frame Deformatting") 202 that
unpacks the applied data into exponent data 203, mantissa data 204, and
other miscellaneous bit allocation information 207. The exponent data 203
is converted into a log power spectrum 206 by a device or function ("Log
Power Spectrum") 205 and this log power spectrum is used by a bit allocator
or bit allocation function ("Bit Allocation") 208 to calculate signal 209,
which is the length, in bits, of each quantized mantissa. The mantissas are
then de-quantized and combined with the exponents by a device or function
("De-Quantize Mantissas") 210 and converted back to the time domain by an
inverse filterbank device or function ("Inverse Filterbank") 212. Inverse
Filterbank 212 also overlaps and sums a portion of the current Inverse
Filterbank result with the previous Inverse Filterbank result (in time) to
create the decoded audio signal 103. In practical decoder implementations,
significant computing resources are required by the Bit Allocation,
De-Quantize Mantissas and Inverse Filterbank devices or functions. More
details of the decoding process may be found in the above-cited
references.

FIGS. 3a and 3b show prior art arrangements for objectively
measuring the loudness of an audio signal. These represent variations of the
Measure Loudness 104 (FIG. 1). Although FIGS. 3a and 3b show examples,
respectively, of two general categories of objective loudness measuring
techniques, the choice of a particular objective measuring technique is not
critical to the invention and other objective loudness measuring techniques
may be employed.

FIG. 3a shows an example of the weighted power measure
arrangement commonly used in loudness measuring. An audio signal 103 is
passed through a weighting filter or filtering function ("Weighting Filter")
302 that is designed to emphasize more perceptibly sensitive frequencies
while deemphasizing less perceptibly sensitive frequencies. The power 305
of the filtered signal 303 is calculated by a device or function ("Power") 304
and averaged over a defined time period by a device or function ("Average")
306 to create a loudness value 105. A number of different standard
weighting filter characteristics exist and some common examples are shown
in FIG. 4. In practice, modified versions of the FIG. 3a arrangement are
often used, the modifications, for example, preventing time periods of
silence from being included in the average.

Psychoacoustic-based techniques are often also used to measure
loudness. FIG. 3b shows a typical prior art arrangement of such a
psychoacoustic-based arrangement. An audio signal 103 is filtered by a
transmission filter or filtering function ("Transmission Filter") 312 that
represents the frequency-varying magnitude response of the outer and
middle ear. The filtered signal 313 is then separated by an auditory
filterbank or filterbank function ("Auditory Filterbank") 314 into frequency
bands that are equivalent to, or narrower than, auditory critical bands. This
may be accomplished by performing a fast Fourier transform (FFT) (as
implemented, for example, by a discrete frequency transform (DFT)) and
then grouping the linearly spaced bands into bands approximating the ear's
critical bands (as in an ERB or Bark scale). Alternatively, this may be
accomplished by a single bandpass filter for each ERB or Bark band. Each
band is then converted by a device or function ("Excitation") 316 into an
excitation signal 317 representing the amount of stimuli or excitation
experienced by the human ear within the band. The perceived loudness or
specific loudness for each band is then calculated from the excitation by a
device or function ("Specific Loudness") 318 and the specific loudness
across all bands is summed by a summer or summing function ("Sum") 320
to create a single measure of loudness 105. The summing process may take
into consideration various perceptual effects, for example frequency
masking. In practical implementations of these perceptual methods,
significant computational resources are required for the transmission filter
and auditory filterbank.

FIG. 5 shows a block diagram of an aspect of the present invention. A
coded digital audio signal 101 is partially decoded by a device or function
("Partial Decode") 502 and the loudness is measured from the partially
decoded information 503 by a device or function ("Measure Loudness") 504.
Depending on how the partial decoding is performed, the resulting loudness
measure 505 may be very similar to, but not exactly the same as, the
loudness measure 105 calculated from the completely decoded audio signal
103 (FIG. 1). In the context of Dolby Digital, Dolby Digital Plus and Dolby
E implementations of aspects of the invention, partial decoding may include
the omission of the Bit Allocation, De-Quantize Mantissas and Inverse
Filterbank devices or functions from a decoder such as the example of FIG.
2.

FIGS. 6a and 6b show two examples of implementations of the
general arrangement of FIG. 5. Although both may employ the same Partial
Decode 502 function or device, each may have a different Measure Loudness
504 function or device: that in the FIG. 6a example being similar to the
example of FIG. 3a and that in the FIG. 6b example being similar to the
example of FIG. 3b. In both examples, the Partial Decode 502 extracts only the
exponents 203 from the coded audio stream and converts the exponents to a
power spectrum 206. Such extraction may be performed by a device or
function ("Frame Sync, Error Detection & Frame De-Formatting") 202 as in
the FIG. 2 example and such conversion may be performed by a device or
function ("Log Power Spectrum") 205 as in the FIG. 2 example. There is no
requirement to de-quantize the mantissas, perform bit allocation, and
perform an inverse filterbank as would be required for a full decoding as
shown in the decoding example of FIG. 2.

The example of FIG. 6a includes a Measure Loudness 504, which may
be a modified version of the loudness measurer or loudness measuring
function of FIG. 3a. In this example, a modified weighting filtering is
applied in the frequency domain by increasing or decreasing the power
values in each band by a weighting filter or weighted filtering function
("Modified Weighting Filter") 601. In contrast, the FIG. 3a example applies
weighting filtering in the time domain. Although it operates in the frequency
domain, the Modified Weighting Filter affects the audio in the same way as
the time-domain Weighting Filter of FIG. 3a. The filter 601 is "modified"
with respect to filter 302 of FIG. 3a in the sense that it operates on log
amplitude values rather than linear values and it operates on a non-linear
rather than a linear frequency scale. The frequency weighted power
spectrum 602 is then converted to linear power and summed across
frequency and averaged across time by a device or function ("Convert, Sum
& Average") 603 applying, for example, Equation 5, below. The output is
an objective loudness value 505.

The example of FIG. 6b includes a Measure Loudness 504, which may
be a modified version of the loudness measurer or loudness measuring
function of FIG. 3b. In this example, a modified transmission filter or
filtering function ("Modified Transmission Filter") 611 is applied directly in
the frequency domain by increasing or decreasing the log power values in
each band. In contrast, the FIG. 3b example applies transmission filtering in
the time domain. Although it operates in the frequency domain, the
Modified Transmission Filter affects the audio in the same way as the
time-domain Transmission Filter of FIG. 3b. A modified auditory filterbank or
filterbank function ("Modified Auditory Filterbank") 613 accepts as input
the linear frequency band spaced log power spectrum and splits or combines
these linearly spaced bands into a critical-band-spaced (e.g., ERB or Bark
bands) filterbank output 315. Modified Auditory Filterbank 613 also
converts the log-domain power signal into a linear signal for the following
excitation device or function ("Excitation") 316. The Modified Auditory
Filterbank 613 is "modified" with respect to the Auditory Filterbank 314 of
FIG. 3b in that it operates on log amplitude values rather than linear values
and converts such log amplitude values into linear values. Alternatively, the
grouping of bands into ERB or Bark bands may be performed in the
Modified Auditory Filterbank 613 rather than the Modified Transmission
Filter 611. The example of FIG. 6b also includes a Specific Loudness 318
for each band and a Sum 320 as in the example of FIG. 3b.

For the arrangements shown in FIGS. 6a and 6b, significant
computational savings are achieved because the decoding does not require
bit allocation, mantissa de-quantization and an inverse filterbank. However,
for both the FIG. 6a and FIG. 6b arrangements, the resulting objective
loudness measurement may not be exactly the same as the measurement
calculated from fully decoded audio. This is because some of the audio
information is discarded and thus the audio information used for the
measurement is incomplete. When aspects of the present invention are
applied to Dolby Digital, Dolby Digital Plus, or Dolby E, the mantissa
information is discarded and only the coarsely quantized exponent values are
retained. For Dolby Digital and Dolby Digital Plus the values are quantized
to increments of 6 dB and for Dolby E they are quantized to increments of 3
dB. The smaller quantization steps in Dolby E result in finer quantized
exponent values and, consequently, a more accurate estimate of the power
spectrum.

Perceptual coders are often designed to alter the length of the
overlapping time segments, also called the block size, in conjunction with
certain characteristics of the audio signal. For example, Dolby Digital uses
two block sizes: a longer block of 512 samples predominantly for
stationary audio signals and a shorter block of 256 samples for more
transient audio signals. The result is that the number of frequency bands and
corresponding number of log power spectrum values 206 varies block by
block. When the block size is 512 samples, there are 256 bands, and when
the block size is 256 samples, there are 128 bands.

There are many ways that the proposed methods in FIGS. 6a and 6b
may handle varying block sizes and each way leads to a similar resulting
loudness measure. For example, the Log Power Spectrum 205 may be
modified to output always a constant number of bands at a constant block
rate by combining or averaging multiple smaller blocks into larger blocks
and spreading the power from the smaller number of bands across the larger
number of bands. Alternatively, the Measure Loudness may accept varying
block sizes and adjust accordingly its filtering, excitation, specific
loudness, averaging and summing processes, for example, by adjusting time
constants.
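
The first option above, merging short blocks so that a constant number of bands is always produced, can be sketched as follows. This is only a minimal illustration under stated assumptions, not the behaviour of any particular decoder: the function name `merge_short_blocks` is hypothetical, two short blocks are assumed to stand in for one long block, and each short-block band is assumed to split its power equally between two long-block bands.

```python
import math


def db_to_power(level_db):
    """Convert a level in dB to linear power."""
    return 10.0 ** (level_db / 10.0)


def power_to_db(power):
    """Convert linear power to dB; zero power maps to -inf."""
    return 10.0 * math.log10(power) if power > 0.0 else float("-inf")


def merge_short_blocks(short_a_db, short_b_db):
    """Combine two short-block log power spectra into one long-block spectrum.

    Assumption (illustrative only): the two short blocks are averaged in the
    linear power domain, and each resulting band is spread over two adjacent
    output bands, each receiving half of the power.
    """
    if len(short_a_db) != len(short_b_db):
        raise ValueError("short blocks must have the same number of bands")
    merged = []
    for a_db, b_db in zip(short_a_db, short_b_db):
        mean_power = 0.5 * (db_to_power(a_db) + db_to_power(b_db))
        half_db = power_to_db(0.5 * mean_power)
        merged.extend([half_db, half_db])
    return merged
```

With two 128-band inputs the output has 256 bands, and the summed linear power of the output equals the average of the summed powers of the two inputs, so a downstream sum-and-average stage sees a consistent block rate.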

Weighted Power Measurement Example 
As an example of aspects of the present invention, a highly
economical version of a weighted power loudness measurement method may
use Dolby Digital bitstreams and the weighted power loudness measure
LeqA. In this highly economical example, only the quantized exponents
contained in a Dolby Digital bitstream are used as an estimate of the audio
signal spectrum to perform the loudness measure. This avoids the additional
computational requirements of performing bit allocation to recreate the
mantissa information, which would otherwise only provide a slightly more
accurate estimate of the signal spectrum.

As depicted in the examples of FIGS. 5 and 6a, the Dolby Digital
bitstream is partially decoded to recreate and extract the log power spectrum,
calculated from the quantized exponent data contained in the bitstream.
Dolby Digital performs low-bitrate audio encoding by windowing 512
consecutive, 50% overlapped PCM audio samples and performing an MDCT
transform, resulting in 256 MDCT coefficients that are used to create the
low-bitrate coded audio stream. The partial decoding performed in FIGS. 5
and 6a unpacks the exponent data E(k) and converts the unpacked data to
256 quantized log power spectrum values, P(k), which form a coarse spectral
representation of the audio signal. The log power spectrum values, P(k), are
in units of dB. The conversion is as follows:

$$P(k) = -E(k) \cdot 20 \cdot \log_{10}(2), \qquad 0 \le k < N \qquad (1)$$

where N = 256, the number of transform coefficients for each block in a
Dolby Digital bitstream. To use the log power spectrum in the computation
of the weighted power measure of loudness, the log power spectrum is
weighted using an appropriate loudness curve, such as one of the A-, B- or
C-weighting curves shown in FIG. 4. In this case, the LeqA power measure
is being computed and therefore the A-weighting curve is appropriate. The
log power spectrum values P(k) are weighted by adding them to discrete
A-weighting frequency values, A_w(k), also in units of dB, as

$$P_w(k) = P(k) + A_w(k), \qquad 0 \le k < N \qquad (2)$$
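
Equations 1 and 2 amount to a scaling followed by an element-wise addition in dB. A minimal sketch, assuming the exponents have already been unpacked into a list of non-negative integers (the function names here are illustrative, not part of any Dolby API):

```python
import math

# One exponent step is a factor of 2 in amplitude, i.e. about 6.02 dB.
DB_PER_EXPONENT_STEP = 20.0 * math.log10(2.0)


def exponents_to_log_power(exponents):
    """Equation 1: P(k) = -E(k) * 20 * log10(2), in dB."""
    return [-e * DB_PER_EXPONENT_STEP for e in exponents]


def apply_weighting(log_power_db, weights_db):
    """Equation 2: P_w(k) = P(k) + A_w(k), both in dB."""
    if len(log_power_db) != len(weights_db):
        raise ValueError("need one weighting value per band")
    return [p + a for p, a in zip(log_power_db, weights_db)]
```

Because the weighting is an addition in the log domain, it costs one add per band, which is the reason the frequency-domain filter is cheap compared with a time-domain filter.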

The discrete A-weighting frequency values, A_w(k), are created by
computing the A-weighting gain values for the discrete frequencies, f_discrete,
where

$$f_{discrete} = \frac{F}{2} + F \cdot k, \qquad 0 \le k < N \qquad (3)$$

where

$$F = \frac{F_s}{2 \cdot N} \qquad (4)$$

and where the sampling frequency F_s is typically equal to 48 kHz for Dolby
Digital. Each set of weighted log power spectrum values, P_w(k), is then
converted from dB to linear power and summed to create the A-weighted
power estimate P_POW of the 512 PCM audio samples as

$$P_{POW} = \sum_{k=0}^{N-1} 10^{P_w(k)/10} \qquad (5)$$
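
The per-block computation of Equations 2 through 5 can be sketched as below. The text does not give a formula for the A-weighting curve itself; the sketch assumes the commonly used analytic A-weighting expression (normalised to about 0 dB at 1 kHz), and the function names are illustrative only.

```python
import math


def a_weighting_db(freq_hz):
    """Analytic A-weighting gain in dB (assumed curve, about 0 dB at 1 kHz)."""
    f2 = freq_hz * freq_hz
    num = (12194.0 ** 2) * f2 * f2
    den = ((f2 + 20.6 ** 2)
           * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
           * (f2 + 12194.0 ** 2))
    return 20.0 * math.log10(num / den) + 2.0


def bin_center_frequencies(sample_rate_hz=48000.0, num_bins=256):
    """Equations 3 and 4: f = F/2 + F*k with F = Fs / (2*N)."""
    spacing = sample_rate_hz / (2.0 * num_bins)
    return [spacing / 2.0 + spacing * k for k in range(num_bins)]


def block_weighted_power(log_power_db, sample_rate_hz=48000.0):
    """Equations 2 and 5: add A_w(k) in dB, convert to linear power, sum."""
    freqs = bin_center_frequencies(sample_rate_hz, len(log_power_db))
    return sum(10.0 ** ((p + a_weighting_db(f)) / 10.0)
               for p, f in zip(log_power_db, freqs))
```

In a real-time setting the weights A_w(k) depend only on the sampling rate and N, so they would normally be computed once and reused for every block rather than recomputed as in this sketch.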



As stated previously, each Dolby Digital bitstream contains
consecutive transforms created by windowing 512 PCM samples with 50%
overlap and performing the MDCT transform. Therefore, an approximation
of the total A-weighted power, P_TOT, of the audio low-bitrate encoded in a
Dolby Digital bitstream may be computed by averaging the power values
across all the transforms in the Dolby Digital bitstream as follows

$$P_{TOT} = \frac{1}{M} \sum_{m=0}^{M-1} P_{POW}(m) \qquad (6)$$

where M equals the total number of transforms contained in the Dolby
Digital bitstream. The average power is then converted to units of dB as
follows.

$$L_A = 10 \cdot \log_{10}(P_{TOT}) - C \qquad (7)$$

where C is a constant offset due to level changes performed in the transform
process during encoding of the Dolby Digital bitstream.
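
Equations 6 and 7 reduce the per-transform powers to a single level in dB. A minimal sketch follows; the value of the offset C is not stated in the text, so it is left as a caller-supplied parameter (`offset_db`, a hypothetical name, defaulting to 0 purely as a placeholder).

```python
import math


def a_weighted_level_db(block_powers, offset_db=0.0):
    """Equations 6 and 7: average the per-transform powers, then convert to dB.

    block_powers holds one A-weighted linear power value per transform.
    offset_db stands in for the constant C, whose value is an assumption here.
    """
    if not block_powers:
        raise ValueError("at least one transform is required")
    p_tot = sum(block_powers) / len(block_powers)  # Equation 6
    if p_tot <= 0.0:
        return float("-inf")  # digital silence has no finite level
    return 10.0 * math.log10(p_tot) - offset_db  # Equation 7
```

For example, `a_weighted_level_db([5.0, 15.0], offset_db=3.0)` averages the two powers to 10, converts that to 10 dB, and subtracts the 3 dB offset.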

Psychoacoustic Measurement Example 
As another example of aspects of the present invention, a highly
economical version of a loudness measurement method may
use Dolby Digital bitstreams and a psychoacoustic loudness measure. In this
highly economical example, as in the previous one, only the quantized
exponents contained in a Dolby Digital bitstream are used as an estimate of
the audio signal spectrum to perform the loudness measure. As in the other
example, this avoids the additional computational requirements of
performing bit allocation to recreate the mantissa information, which would
otherwise only provide a slightly more accurate estimate of the signal
spectrum.

International Patent Application No. PCT/US2004/016964, filed May
27, 2004, Seefeldt et al, published as WO 2004/111994 A2 on December 23,
2004, which application designates the United States, discloses, among other
things, an objective measure of perceived loudness based on a
psychoacoustic model. Said application is hereby incorporated by reference
in its entirety. The log power spectrum values, $P(k)$, derived from the partial
decoding of a Dolby Digital bitstream may serve as inputs to a technique,
such as in said international application, as well as other similar
psychoacoustic measures, rather than the original PCM audio. Such an
arrangement is shown in the example of FIG. 6b. Borrowing terminology
and notation from said PCT application, an excitation signal $E(b)$
approximating the distribution of energy along the basilar membrane of the
inner ear at critical band $b$ may be approximated from the log power
spectrum values as follows:

$$E(b) = \sum_{k=0}^{N-1} |T(k)|^2 \, |H_b(k)|^2 \, 10^{P(k)/10} \qquad (8)$$

where $T(k)$ represents the frequency response of the transmission filter and
$H_b(k)$ represents the frequency response of the basilar membrane at a
location corresponding to critical band $b$, both responses being sampled at
the frequency corresponding to transform bin $k$. Next the excitations
corresponding to all transforms in the Dolby Digital bitstream are averaged
to produce a total excitation:

$$\bar{E}(b) = \frac{1}{M} \sum_{m=0}^{M-1} E(b, m) \qquad (9)$$
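The excitation and averaging steps (Equations 8 and 9) can be sketched as below. The function names and the data layout are assumptions made for illustration: `t_mag[k]` holds the magnitude |T(k)|, and `h_mag[b][k]` holds |H_b(k)| for band b. The actual filter responses come from the psychoacoustic model in the cited PCT application and are not reproduced here.

```python
def excitation(p_db, t_mag, h_mag):
    """Equation 8: excitation E(b) of one transform from its log power spectrum (dB)."""
    lin = [10.0 ** (p / 10.0) for p in p_db]   # dB to linear power per bin
    return [
        sum((t * h) ** 2 * x for t, h, x in zip(t_mag, h_b, lin))
        for h_b in h_mag                       # one output value per band b
    ]

def mean_excitation(per_transform):
    """Equation 9: average E(b, m) over the M transforms of the bitstream."""
    m = len(per_transform)
    n_bands = len(per_transform[0])
    return [sum(e[b] for e in per_transform) / m for b in range(n_bands)]
```

Each band is a weighted sum of the linear bin powers, so the whole of Equation 8 amounts to one matrix-vector product per transform.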

Using equal loudness contours, the total excitation at each band is
transformed into an excitation level that generates the same loudness at 1
kHz. Specific loudness, a measure of perceptual loudness distributed across
frequency, is then computed from the transformed excitation, $\bar{E}_{1kHz}(b)$,
through a compressive non-linearity:

$$N(b) = G \left( \left( \frac{\bar{E}_{1kHz}(b)}{TQ_{1kHz}} \right)^{\alpha} - 1 \right) \qquad (10)$$

where $TQ_{1kHz}$ is the threshold in quiet at 1 kHz and the constants $G$ and $\alpha$ are
chosen to match data generated from psychoacoustic experiments describing
the growth of loudness. Finally, the total loudness, $L$, represented in units of
sone, is computed by summing the specific loudness across bands:

$$L = \sum_{b} N(b) \qquad (11)$$
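The specific-loudness and total-loudness steps can be sketched as follows, reading Equation 10 as N(b) = G((E_1kHz(b) / TQ_1kHz)^alpha - 1) and Equation 11 as a sum over bands. The function names and the default values of `g`, `alpha` and `tq_1khz` are placeholders chosen for illustration, not the constants fitted to psychoacoustic data, and the equal-loudness transformation is assumed to have been applied already.

```python
def specific_loudness(e_1khz, g=1.0, alpha=0.23, tq_1khz=1.0):
    """Equation 10 for one band; g, alpha and tq_1khz are placeholder values."""
    return g * ((e_1khz / tq_1khz) ** alpha - 1.0)

def total_loudness(e_1khz_bands, **constants):
    """Equation 11: sum the specific loudness across bands (result in sone)."""
    return sum(specific_loudness(e, **constants) for e in e_1khz_bands)
```

With an exponent below one the function grows more slowly than the excitation itself, which is the compressive behaviour described above. An excitation exactly at the threshold in quiet maps to a specific loudness of zero.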
For the purposes of adjusting the audio signal, one may wish to
compute a matching gain, $G_{Match}$, which when multiplied with the audio
signal makes the loudness of the adjusted audio equal to some reference
loudness, $L_{REF}$, as measured by the described psychoacoustic technique.
Because the psychoacoustic measure involves a non-linearity in the
computation of specific loudness, a closed form solution for $G_{Match}$ does not
exist. Instead, an iterative technique described in said PCT application
may be employed in which the square of the matching gain is adjusted and
multiplied with the total excitation, $\bar{E}(b)$, until the corresponding total
loudness, $L$, is within a threshold difference with respect to the reference
loudness, $L_{REF}$. The loudness of the audio may then be expressed in dB with
respect to the reference as:

$$L_{dB} = 20 \log_{10}\left(\frac{1}{G_{Match}}\right) \qquad (12)$$
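One way to realize such an iterative search is a bisection on the squared gain, sketched below. This is an assumption made for illustration and is not the specific procedure of the cited PCT application. The names `matching_gain`, `loudness_fn` and the search bounds are hypothetical, and the loudness function is assumed to increase monotonically with the excitation scale. The dB expression reads the loudness relative to the reference as 20 log10(1 / G_Match).

```python
import math

def matching_gain(excitation, loudness_fn, l_ref, tol=1e-4, max_iter=100):
    """Find G such that loudness_fn(G^2 * excitation) is within tol of l_ref."""
    lo, hi = 1e-6, 1e6                     # assumed search bounds on G^2
    g2 = 1.0
    for _ in range(max_iter):
        g2 = math.sqrt(lo * hi)            # geometric midpoint of the bounds
        loud = loudness_fn([g2 * e for e in excitation])
        if abs(loud - l_ref) <= tol:
            break
        if loud < l_ref:
            lo = g2                        # too quiet: raise the lower bound
        else:
            hi = g2                        # too loud: lower the upper bound
    return math.sqrt(g2)

def loudness_db_re_reference(g_match):
    """Loudness of the audio relative to the reference, in dB."""
    return 20.0 * math.log10(1.0 / g_match)
```

A matching gain above one means the audio had to be boosted to reach the reference, so its loudness relative to the reference comes out negative in dB.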



Other Perceptual Audio Codecs

Aspects of the present invention are not limited to Dolby Digital,
Dolby Digital Plus, and Dolby E coding systems. Audio signals coded using
certain other coding systems in which an approximation of the power
spectrum of the audio is provided by, for example, scale factors, spectral
envelopes, and linear predictive coefficients that may be recovered from an
encoded bitstream without fully decoding the bitstream to produce audio
may also benefit from aspects of the present invention.

Error in Calculating Power from Dolby Digital Exponents

The Dolby Digital exponents E(k) represent a coarse quantization of
the logarithm of the MDCT spectrum coefficients. There are a number of
sources of error when using these values as a coarse power spectrum.

First, in Dolby Digital, the quantization process itself results in a mean
error of approximately 2.7 dB when comparing the values of the power
spectrum generated from the exponents (see Equation 1, above) and the
power values calculated directly from the MDCT coefficients. This mean
error, which was determined experimentally, may be incorporated into the
constant offset C in Equation 7, above.
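The kind of measurement that yields such a mean error can be illustrated with synthetic data. The sketch below is a simplified model, not the experiment behind the 2.7 dB figure: it assumes each coefficient x satisfies |x| = m * 2^(-e) with 0.5 < m <= 1 and an exponent e clipped to the range 0 to 24, and it ignores exponent grouping and differential coding. Function names are hypothetical.

```python
import math
import random

def exponent_power_db(x, max_exp=24):
    """Power in dB implied by the exponent alone (mantissa ignored).

    Assumes |x| = m * 2**-e with 0.5 < m <= 1 and e clipped to [0, max_exp].
    """
    e = min(max(math.floor(-math.log2(abs(x))), 0), max_exp)
    return -20.0 * math.log10(2.0) * e

def mean_exponent_error_db(coeffs):
    """Mean of (exponent-only power - true power) in dB over nonzero coefficients."""
    errs = [exponent_power_db(x) - 20.0 * math.log10(abs(x))
            for x in coeffs if x != 0.0]
    return sum(errs) / len(errs)

# Synthetic coefficients; real MDCT data would follow a different distribution.
random.seed(1)
coeffs = [random.uniform(-1.0, 1.0) for _ in range(20000)]
print(round(mean_exponent_error_db(coeffs), 2), "dB mean overestimate")
```

Under this model the exponent-only estimate never falls below the true power and exceeds it by less than about 6.02 dB per coefficient, so the mean error is a positive bias that can be absorbed into a constant. The exact figure depends on the assumed coefficient distribution and is not expected to reproduce the experimentally determined value exactly.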

Second, under certain signal conditions, such as transients, exponent
values are grouped across frequency (referred to as "D25" and "D45" modes
in the above-cited A/52A document). This grouping across frequency causes
the mean exponent error to be less predictable, and thus more difficult to
account for by incorporating into the constant C of Equation 7. In practice,
error due to this grouping may be ignored for two reasons: (1) the grouping
is used rarely and (2) the nature of the signals for which the grouping is used
results in a measured mean error which is similar to the non-averaged case.

Implementation

The invention may be implemented in hardware or software, or a
combination of both (e.g., programmable logic arrays). Unless otherwise
specified, the algorithms or processes included as part of the invention are
not inherently related to any particular computer or other apparatus. In
particular, various general-purpose machines may be used with programs
written in accordance with the teachings herein, or it may be more
convenient to construct more specialized apparatus (e.g., integrated circuits)
to perform the required method steps. Thus, the invention may be
implemented in one or more computer programs executing on one or more
programmable computer systems each comprising at least one processor, at
least one data storage system (including volatile and non-volatile memory
and/or storage elements), at least one input device or port, and at least one
output device or port. Program code is applied to input data to perform the
functions described herein and generate output information. The output
information is applied to one or more output devices, in known fashion.

Each such program may be implemented in any desired computer
language (including machine, assembly, or high level procedural, logical, or
object oriented programming languages) to communicate with a computer
system. In any case, the language may be a compiled or interpreted
language.

It will be appreciated that some steps or functions shown in the 
exemplary figures perform multiple substeps and may also be shown as 
multiple steps or functions rather than one step or function. It will also be
appreciated that various devices, functions, steps, and processes shown and 
described in various examples herein may be shown combined or separated 
in ways other than as shown in the various figures. For example, when 
implemented by computer software instruction sequences, various functions 
and steps of the exemplary figures may be implemented by multithreaded 
software instruction sequences running in suitable digital signal processing 
hardware, in which case the various devices and functions in the examples 
shown in the figures may correspond to portions of the software instructions.

Each such computer program is preferably stored on or downloaded to
a storage medium or device (e.g., solid state memory or media, or magnetic or
optical media) readable by a general or special purpose programmable
computer, for configuring and operating the computer when the storage
medium or device is read by the computer system to perform the procedures
described herein. The inventive system may also be considered to be 
implemented as a computer-readable storage medium, configured with a 
computer program, where the storage medium so configured causes a 
computer system to operate in a specific and predefined manner to perform 
the functions described herein. 

A number of embodiments of the invention have been described. 
Nevertheless, it will be understood that various modifications may be made 
without departing from the spirit and scope of the invention. For example, 
some of the steps described herein may be order independent, and thus can
be performed in an order different from that described. 




WO 2006/113047 PCT/US2006/010823

Claims

1. A method for measuring the loudness of audio encoded in a
bitstream that includes data from which an approximation of the power
spectrum of the audio can be derived without fully decoding the audio,
comprising
deriving said approximation of the power spectrum of the audio from
said bitstream without fully decoding the audio, and
determining an approximate loudness of the audio in response to the
approximation of the power spectrum of the audio.



2. A method according to claim 1 wherein said data includes coarse
representations of the audio and associated finer representations of the audio,
and wherein said approximation of the power spectrum of the audio is
derived from the coarse representations of the audio.



3. A method according to claim 2 wherein the audio encoded in a
bitstream is subband encoded audio having a plurality of frequency
subbands, each subband having a scale factor and sample data associated
therewith, and wherein the coarse representations of the audio comprise scale
factors and the associated finer representations of the audio comprise sample
data associated with each scale factor.



4. A method according to claim 3 wherein the scale factor and sample
data of each subband represent spectral coefficients in the subband by
exponential notation in which the scale factor comprises an exponent and the
associated sample data comprises mantissas.



5. A method according to any of claims 1-4 wherein said bitstream is
an AC-3 encoded bitstream.

6. A method according to claim 2 wherein the audio encoded in a
bitstream is linear predictive coded audio in which the coarse representations 
of the audio comprise linear predictive coefficients and the finer 
representations of the audio comprise excitation information associated with 
the linear predictive coefficients. 

7. A method according to claim 2 wherein the coarse representations 
of the audio comprise at least one spectral envelope and the finer 
representations of the audio comprise spectral components associated with 
said at least one spectral envelope. 

8. A method according to any of claims 1-7 wherein determining an 
approximate loudness of the audio in response to the approximation of the 
power spectrum of the audio includes applying a weighted power loudness 
measure. 

9. A method according to claim 8 in which the weighted power 
loudness measure employs a filter that deemphasizes less perceptible 
frequencies and averages the power of the filtered audio over time. 

10. A method according to any of claims 1-7 wherein determining an 
approximate loudness of the audio in response to the approximation of the 
power spectrum of the audio includes applying a psychoacoustic loudness
measure.



11. A method according to claim 10 in which the psychoacoustic
loudness measure employs a model of the human ear to determine specific
loudness in each of a plurality of frequency bands similar to the critical
bands of the human ear.

12. A method according to any of claims 3-5 wherein determining an 
approximate loudness of the audio in response to the approximation of the 
power spectrum of the audio includes applying a psychoacoustic loudness 
measure. 

13. A method according to claim 12 in which said subbands are 
similar to the critical bands of the human ear and the psychoacoustic 
loudness measure employs a model of the human ear to determine specific 
loudness in each of said subbands. 

14. Apparatus for measuring the loudness of audio encoded in a 
bitstream that includes data from which an approximation of the power 
spectrum of the audio can be derived without fully decoding the audio,
comprising
means for deriving said approximation of the power spectrum of the
audio from said bitstream without fully decoding the audio, and
means for determining an approximate loudness of the audio in
response to the approximation of the power spectrum of the audio.

15. Apparatus according to claim 14 wherein said data includes
coarse representations of the audio and associated finer representations of the
audio, and wherein said approximation of the power spectrum of the audio is
derived from the coarse representations of the audio.

16. Apparatus according to claim 15 wherein the audio encoded in a
bitstream is subband encoded audio having a plurality of frequency
subbands, each subband having a scale factor and sample data associated
therewith, and wherein the coarse representations of the audio comprise scale
factors and the associated finer representations of the audio comprise sample
data associated with each scale factor.

17. Apparatus according to claim 16 wherein the scale factor and 
sample data of each subband represent spectral coefficients in the subband 
by exponential notation in which the scale factor comprises an exponent and 
the associated sample data comprises mantissas. 

18. Apparatus according to any of claims 14-17 wherein said 
bitstream is an AC-3 encoded bitstream. 

19. Apparatus according to claim 15 wherein the audio encoded in a
bitstream is linear predictive coded audio in which the coarse representations 
of the audio comprise linear predictive coefficients and the finer 
representations of the audio comprise excitation information associated with 
the linear predictive coefficients. 

20. Apparatus according to claim 15 wherein the coarse 
representations of the audio comprise at least one spectral envelope and the 
finer representations of the audio comprise spectral components associated 
with said at least one spectral envelope. 

21. Apparatus according to any of claims 14-20 wherein said means 
for determining an approximate loudness of the audio in response to the 
approximation of the power spectrum of the audio includes means for 
applying a weighted power loudness measure.

22. Apparatus according to claim 21 in which the weighted power
loudness measure employs a filter that deemphasizes less perceptible
frequencies and averages the power of the filtered audio over time.

23. Apparatus according to any of claims 14-20 wherein said means
for determining an approximate loudness of the audio in response to the
approximation of the power spectrum of the audio includes means for
applying a psychoacoustic loudness measure.

24. Apparatus according to claim 23 in which the psychoacoustic
loudness measure employs a model of the human ear to determine specific
loudness in each of a plurality of frequency bands similar to the critical
bands of the human ear.

25. Apparatus according to any of claims 16-18 wherein said means
for determining an approximate loudness of the audio in response to the
approximation of the power spectrum of the audio includes means for
applying a psychoacoustic loudness measure.

26. Apparatus according to claim 25 in which said subbands are
similar to the critical bands of the human ear and the psychoacoustic
loudness measure employs a model of the human ear to determine specific
loudness in each of said subbands.

27. Apparatus adapted to perform the methods of any one of claims 1
through 13.

28. A computer program, stored on a computer-readable medium for
causing a computer to perform the methods of any one of claims 1 through
13.

Box No. I Basis of the opinion



1. With regard to the language, this opinion has been established on the basis of:
☒ the international application in the language in which it was filed

□ a translation of the international application into , which is the language of a translation furnished for the 
purposes of international search (Rules 12.3(a) and 23.1 (b)). 

2. With regard to any nucleotide and/or amino acid sequence disclosed in the international application and 
necessary to the claimed invention, this opinion has been established on the basis of: 

a. type of material: 

□ a sequence listing 

□ table (s) related to the sequence listing 

b. format of material: 

□ on paper 

□ in electronic form 

c. time of filing/furnishing: 

□ contained in the international application as filed. 

□ filed together with the international application in electronic form. 

□ furnished subsequently to this Authority for the purposes of search. 

3. □ In addition, in the case that more than one version or copy of a sequence listing and/or table relating thereto

has been filed or furnished, the required statements that the information in the subsequent or additional 
copies is identical to that in the application as filed or does not go beyond the application as filed, as 
appropriate, were furnished. 

4. Additional comments: 



Box No. II Priority 

1. ☒ The validity of the priority claim has not been considered because the International Searching Authority
does not have in its possession a copy of the earlier application whose priority has been claimed or, where
required, a translation of that earlier application. This opinion has nevertheless been established on the
assumption that the relevant date (Rules 43bis.1 and 64.1) is the claimed priority date.

2. □ This opinion has been established as if no priority had been claimed due to the fact that the priority claim
has been found invalid (Rules 43bis.1 and 64.1). Thus for the purposes of this opinion, the international
filing date indicated above is considered to be the relevant date.

3. Additional observations, if necessary: 
Inventive step (IS)
Yes:
No:
Claims

Industrial applicability (IA)
Yes:

2. Citations and explanations
see separate sheet 



Box No. VII Certain defects in the international application 



The following defects in the form or contents of the international application have been noted: 
see separate sheet 



Box No. VIII Certain observations on the international application 



The following observations on the clarity of the claims, description, and drawings or on the question whether the 
claims are fully supported by the description, are made: 



see separate sheet

Re Item V

Reasoned statement with regard to novelty, inventive step or industrial applicability; 
citations and explanations supporting such statement 

1. Reference is made to the following documents:
D1: US 2001/027393 A1
D2: US 2004/184537 A1
D3: WO 2004/073178 A
D4: US-B1-6 430 533

D5: SMITH P J ET AL: "TANDEM-FREE VOIP CONFERENCING: A BRIDGE TO 
NEXT-GENERATION NETWORKS" IEEE COMMUNICATIONS MAGAZINE, 
IEEE SERVICE CENTER, NEW YORK, NY, US, vol. 41, no. 5, May 2003 (2003- 
05), pages 136-145, XP001166417 

2. Lack of novelty. 

The present application does not meet the criteria of Article 33(1) PCT, because the 
subject matter of claims 1, 14 and 27 is not new in the sense of Article 33(2) PCT.

2.1 Independent claim 1 

The document D1 discloses (the references in parentheses applying to this 
document): 

a method for measuring the loudness of audio encoded in a bitstream that includes 
data from which an approximation of the power spectrum of the audio can be derived 
without fully decoding the audio (para. [0017]-[0023]; para. [0048]-[0077]), comprising 
deriving said approximation of the power spectrum of the audio from said 
bitstream without fully decoding the audio (para. [0023]; para. [0052]-[0053]),
and
determining an approximate loudness of the audio in response to the 
approximation of the power spectrum of the audio (para. [0023]: the calculation 
of the total energy in each frequency band implies calculation of the loudness; 
see also para. [0027]: the term "masking" implicitly discloses the notion of 
"loudness"). 

2.2 Independent claims 14 and 27 

These claims essentially define an apparatus which corresponds closely to the 
method of independent claim 1 in that it defines a respective structural feature for
each method step of claim 1. Such an apparatus is implicitly disclosed by the
document D1 because it defines nothing but the features directly implementing the
method of claim 1.

3. Lack of inventive step. 

The present application does not meet the criteria of Article 33(1) PCT, because the
subject-matter of claims 5, 8-11, 18 and 21-24 does not involve an inventive step in
the sense of Article 33(3) PCT.

3.1 Independent claim 28

The mere implementation by means of software of a general method, such as the
method of claim 1, is very common nowadays and does not render the subject matter
of the claim inventive.

3.2 Dependent claims

Claim 5: D1 (para. [0057]) suggests performing the encoding by means of the "MPEG-4
AAC" coding scheme. However, the AC-3 coding algorithm is an alternative which is
well-known in the art.

Claims 8-9: as admitted by the applicant (p. 2, l. 18-26), weighted power measures
are well-known in the art to measure the perceived loudness.

Claims 10-11: as admitted by the applicant (p. 2, l. 26 - p. 3, l. 6), psycho-acoustic
measures are well-known in the art to measure the perceived loudness.

Claims 18, 21-24: see respective claims 5 and 8-11.

4. Positive statement. 

No objections with respect to novelty, Art. 33(2) PCT, and/or inventive step, Art. 33(3) 
PCT, are raised with respect to claims 2-4, 6-7, 12-13, 15-17, 19-20 and 25-26. 

4.1 Claim 2 

a. Claim 2 further specifies that the data of claim 1 includes coarse representations and 
associated finer representations.

b. Novelty: this feature is not disclosed in the closest prior art document D1, which
shows an audio conferencing system having N terminals (para. [0038]) with N 
corresponding decoders (para. [0047]) in which a partial decoding in the form of 
inverse quantization is performed (para. [0052]). The inverse-quantized values are 
used to calculate the total energy in each frequency band (para. [0023] and [0053]), 
after which a recoding is performed (para. [0056]). 

c. Technical effect: the use of coarse and associated finer representations allows for 
more flexibility in the computational cost of the decoding algorithm. 

d. The solution which is specified in claim 2 in order to achieve the effect of point c is 
inventive, Art. 33(3) PCT, for the reason that none of the documents cited in the 
international search report points in the direction of combining the features as 
mentioned in point a. In particular: 

i) In D1 (para. [0052]) itself the term "partial decoding" refers to the fact that the 
decoding process is not performed completely: only an inverse quantization 
process is made whereas other functions which are normally performed in a full- 
decoding algorithm (e.g. inverse transformation to the time domain) are not 
present. D1 does not hint at applying a layered or scalable approach as in claim 
2. 

ii) From D2 (para. [0016]-[0017]; para. [0025]-[0027]) it is known that a typical
scalable coding algorithm takes the highest order bits of all quantized spectral 
values, subjects them to arithmetic coding and writes them into the bitstream as 
a first scaling layer. Next, a second scaling layer is formed by means of the 
second highest order bits. The decoding algorithm then, of course, performs the 
corresponding inverse steps. 

However, the skilled person would not apply this scalable approach to D1, since
the (de)coding methods in D1 and D2 are fundamentally different. As already
pointed out in i) above, the partial decoding procedure in D1 stops at the
dequantization level (which suffices to determine the total energy of each band)
and does not perform an inverse arithmetic coding. Application of the method of
D2 to that of D1 would entail an unnecessary complication of the algorithm of
D1, which the skilled person would refrain from.

iii) The documents D3-D4 are only relevant as background information on partial 
AC-3 decoding. 

iv) D5 (p. 139, left column) discloses partial decoding of a bitstream in a VoIP
application to monitor gain (i.e. loudness) and spectral parameters in the 
bitstream, without mentioning how the partial decoding is done or how the gain 
is obtained. 

v) Combination of any of the documents cited in the international search report 
and general knowledge does not lead to the subject matter of claim 2 either. 

4.2 Claim 15 refers to an apparatus for implementing the method of claim 2 and is new
and inventive for the same reasons as given for claim 2. 

4.3 The claims 3-4, 6-7 and 12-13 are dependent on claim 2 and claims 16-17, 19-20, 
25-26 are dependent on claim 15. Therefore they also fulfil the requirements of the 
PCT with respect to novelty and inventive step. 

5. All claims fulfill the requirement with respect to industrial applicability, Art. 33(4) PCT, 
for obvious reasons. 

Re Item VII: Form or content of the application 

The features of the claims are not provided with reference signs placed in parentheses 
(Rule 6.2(b) PCT). 

Contrary to the requirements of Rule 5.1(a)(ii) PCT, the relevant background art disclosed
in document D1 is not mentioned in the description, nor is this document identified therein.

The document mentioned on p. 7, l. 25-29 could not be retrieved unambiguously at the
time of search and is not considered in this opinion.

Re Item VIII: Reasoned statement with regard to clarity, Art. 6 PCT. 

The statement in the description on page 20, l. 29 ("the spirit [...] of the invention") implies
that the subject matter for which protection is sought may be different to that defined by 
the claims, thereby resulting in lack of clarity (Article 6 PCT) when used to interpret them. 

Claim 10, taken to refer back to claims 3-5, has the same wording as claim 12. Similarly,
claim 23, referring back to claims 16-18, has the same wording as claim 25. Hence these 
claims lack conciseness. 

Although the apparatus claims 14 and 27 have been drafted as separate independent 
claims, these apparatus claims appear to relate effectively to the same subject matter and 
to differ from each other only in respect of the terminology used for the features of that 
subject matter. The aforementioned claims therefore lack conciseness and as such do not 
meet the requirements of Article 6 PCT. 


